3D Generation: InstantMesh Training, Inference, and Eval Scripts #681

HaFred · 2024-10-01T08:27:14Z

This PR implements InstantMesh for 3D meshing using multiview images. The torch-cuda version with textures is shown below.

The input is a multiview video of shape n x 3 x h x w, and the output is a 3D mesh file, `mesh.obj`, containing vertices and faces.
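
For readers unfamiliar with the output format, here is a minimal sketch of how vertices and faces map onto Wavefront OBJ text. It is not the export code of this PR; `write_obj` and the tetrahedron data are hypothetical names and values chosen for illustration.

```python
import io


def write_obj(stream, vertices, faces):
    """Write a triangle mesh as Wavefront OBJ text.

    vertices: iterable of (x, y, z) floats.
    faces: iterable of (i, j, k) zero-based vertex indices.
    """
    for x, y, z in vertices:
        stream.write(f"v {x:.6f} {y:.6f} {z:.6f}\n")
    for i, j, k in faces:
        # OBJ face indices are 1-based, so shift each index by one.
        stream.write(f"f {i + 1} {j + 1} {k + 1}\n")


# Hypothetical toy mesh: a single tetrahedron.
tetra_vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
tetra_faces = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]

buffer = io.StringIO()
write_obj(buffer, tetra_vertices, tetra_faces)
obj_text = buffer.getvalue()
```

Writing to a file instead only requires passing an open text file handle in place of the `io.StringIO` buffer.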
The environment specs are as follows.

MindSpore Version:               2.3.1 release
CANN Version:                    7.3
Ascend Driver:                   23.0.rc3.6

3D Mesh Demos

Inference demos with the pretrained checkpoint can be found in the README. They can be viewed and interacted with smoothly in any HTML renderer, such as VS Code, as shown below.

The demo links are listed here for easy access:

akun
anya

The corresponding input multiview images are illustrated here. Note that the input multiview images can be obtained via the SV3D pipeline or, as in the original implementation, the Zero123++ pipeline; the paper's core contribution is the process of building a 3D mesh from multiview images.

Limitations in Inference

1. Cache Miss in `mint.unique()` and other ACLNN Operators

As introduced in the README, InstantMesh extracts the isosurface for meshing using FlexiCubes, which essentially requires a unique operation to determine the surfaces of an object. Unfortunately, the operator `mint.unique()` is not yet supported by ACLNN, only by AICPU, which causes the program to get stuck on cache misses, according to the MindSpore framework colleagues. Once `mint.unique()` is fully supported by A+M, we will implement 3D meshing with FlexiCubes for higher resolution.

(Update on August 1st: It turns out that the CANN operator aclnnUniqueDim takes too long. We are working with the framework and CANN teams on a solution.)
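
To illustrate the kind of unique operation discussed above, here is a minimal pure-Python sketch of a row-wise unique with an inverse index, applied to grid edges shared between neighbouring cells. It is an assumption-laden stand-in, not the FlexiCubes code or the `mint.unique()` implementation; `unique_rows` and the edge data are hypothetical, and unlike typical tensor unique ops it keeps first-seen order rather than sorting.

```python
def unique_rows(rows):
    """Return (unique, inverse) such that unique[inverse[i]] == rows[i].

    Keeps first-seen order; tensor unique ops usually return sorted output.
    """
    index_of = {}
    unique = []
    inverse = []
    for row in rows:
        key = tuple(row)
        if key not in index_of:
            # First time this row is seen: assign it the next free index.
            index_of[key] = len(unique)
            unique.append(key)
        inverse.append(index_of[key])
    return unique, inverse


# Hypothetical example: two neighbouring cells list the grid edges
# (as sorted pairs of corner ids) that the surface crosses.
cell_edges = [
    (0, 1), (1, 3), (2, 3),  # cell A
    (1, 3), (3, 5), (4, 5),  # cell B shares edge (1, 3) with cell A
]
unique_edges, edge_to_vertex = unique_rows(cell_edges)
```

Each unique edge would then receive exactly one mesh vertex, and `edge_to_vertex` tells every cell which vertex index to use, so shared edges do not produce duplicate vertices.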

Workaround: 3D Meshing with the Raw Triplane Features using Marching Cubes

For the reasons above, we need a workaround to extract 3D meshes from multiview images (from SV3D in our case). Since this PR already obtains a rough SDF from the SDF MLP heads, which take the triplane features as input, a straightforward approach is to feed that rough SDF to a classic isosurface extraction method such as Marching Cubes. Because optimization with Marching Cubes lacks the degrees of freedom needed to represent high-quality meshes, it tends to use more vertices (and therefore more faces, i.e. triangles) to fit an irregular 3D shape, especially when the shape cannot be well approximated by a surface. Details can be found in the FlexiCubes paper.

FlexiCubes	Marching Cubes
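
As a rough illustration of the workaround, the sketch below samples an SDF on a regular grid and places a surface point on every grid edge whose endpoints have opposite signs, using the linear interpolation rule that Marching Cubes applies for vertex placement. It covers only this vertex-placement step, not the lookup-table triangulation, and it is not the code in this PR: `sphere_sdf` is a hypothetical stand-in for the rough SDF predicted by the MLP heads, and the grid size and bounds are arbitrary assumptions.

```python
import math


def sphere_sdf(x, y, z, radius=0.5):
    """Signed distance to a sphere centred at the origin (negative inside)."""
    return math.sqrt(x * x + y * y + z * z) - radius


def zero_crossings(sdf, n=8, lo=-1.0, hi=1.0):
    """Find surface points on the edges of an n x n x n sample grid."""
    step = (hi - lo) / (n - 1)
    coord = [lo + i * step for i in range(n)]
    # Sample the SDF once at every grid corner.
    values = {
        (i, j, k): sdf(coord[i], coord[j], coord[k])
        for i in range(n) for j in range(n) for k in range(n)
    }
    points = []
    for (i, j, k), a in values.items():
        # Visit the three edges leaving this corner in +x, +y and +z.
        for di, dj, dk in ((1, 0, 0), (0, 1, 0), (0, 0, 1)):
            neighbour = (i + di, j + dj, k + dk)
            if neighbour not in values:
                continue  # edge would leave the grid
            b = values[neighbour]
            if (a < 0.0) == (b < 0.0):
                continue  # same sign: the surface does not cross this edge
            t = a / (a - b)  # fraction along the edge where the SDF is ~0
            points.append((
                coord[i] + t * di * step,
                coord[j] + t * dj * step,
                coord[k] + t * dk * step,
            ))
    return points


surface_points = zero_crossings(sphere_sdf)
```

A complete extraction would additionally connect these points into triangles per cube; a library routine such as scikit-image's `measure.marching_cubes` covers the full algorithm.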

2. CUDA Extension for Rasterization

InstantMesh uses nvdiffrast for UV map rasterization and 2D rendering of the FlexiCubes 3D volume. This is bypassed for now.

SamitHuang · 2024-10-25T06:36:46Z

examples/instantmesh/README.md

+
+The illustrations here are better viewed in viewers than with HTML support (e.g., the vscode built-in viewer).
+
+## Environments


Please refactor this section, referring to:
https://github.com/mindspore-lab/mindone/wiki/%E6%A8%A1%E5%9E%8B%E7%89%88%E6%9C%AC%E4%B9%A6%E5%86%99%E6%A0%B7%E4%BE%8B

instantmesh inference & stage 1 training, also the eval script is provided

d10d44d

HaFred requested review from CaitinZhao, SamitHuang and zhanghuiyao as code owners October 1, 2024 08:27

HaFred added 6 commits October 1, 2024 16:38

update readme

57b66d7

putting on renderer utils

b4054ed

fixes about fmt and f-string issue in precommit check and mindone import

03ece27

Merge branch 'mindspore-lab:master' into itmh_oct1

152f310

Merge branch 'itmh_oct1' of https://github.com/HaFred/mindone into itmh_oct1

d5a58e2

supporting cosine_annealing_warm_restarts_lr and top_k saving ckptcallback, and some refactoring

35d27c3

HaFred requested a review from vigo999 as a code owner October 8, 2024 08:13

HaFred added 8 commits October 15, 2024 09:45

housekeeping

1fc4214

Merge branch 'mindspore-lab:master' into itmh_oct1

37d961f

revert to f-string while meeting flake8 constraints

70663da

Update README.md

7073b97

lpips loss alignment

f939579

Merge branch 'mindspore-lab:master' into itmh_oct1

f5ea787

put on mindcv version

0eb7641

Merge branch 'itmh_oct1' of https://github.com/HaFred/mindone into itmh_oct1

4206d0d

SamitHuang reviewed Oct 25, 2024

HaFred added 10 commits October 25, 2024 16:23

update the

efdd423

Merge branch 'mindspore-lab:master' into itmh_oct1

9bf2358

eval output to the same path as the loaded ckpt, also some housekeeping for loggers

f48983c

fix the ckpt saving path cfg

0dfbf99

update cfg

53e2216

update arch to support loading vanilla stage 1 ckpt and ms-trained stage 1 ckpt with eval.py seamlessly

e2c29e1

switch ops to mint AMAP

6ac74a9

Merge branch 'mindspore-lab:master' into itmh_oct1

71b70f3

upload the safetensor conversion snippet mentioned in the readme

59697a8

Merge branch 'mindspore-lab:master' into itmh_oct1

88a3e9d

update link

968a03b
